Data Repository Organization and Recuperation Process for Multilingual Lexical Databases
Abstract
This paper describes data management in multilingual lexical databases. Because NLP systems depend on lexical data, building them requires a huge amount of work. It is therefore important to use rigorous, powerful systems and to obtain data at minimal cost. As an open-source project, Papillon reuses existing lexical data and aims to get volunteers to collaborate. To make this possible, it combines state-of-the-art concepts from linguistics and computing.
Similar resources
Multilingual Lexical Network from the Archives of the Digital Silk Road
We describe the construction of a specialized multilingual lexical resource dedicated to the archive of the Digital Silk Road (DSR). The DSR project creates digital archives of cultural heritage along the historical Silk Road; more than 116 basic references on the Silk Road have been digitized and made available online. These books are written in various languages and attract people...
Building Specialized Multilingual Lexical Graphs Using Community Resources
We describe methods for compiling domain-dedicated multilingual terminological data from various resources. We focus on collecting data from online community users as the main source. Our approach therefore depends on acquiring contributions from volunteers (explicit approach) and on analyzing users’ behaviors to extract interesting patterns and facts (implicit approach). As a ...
From Resources to Applications. Designing the Multilingual ISLE Lexical Entry
The ISLE Computational Lexicon Working Group is committed to the consensual definition of a standardized infrastructure to develop multilingual resources for HLT applications. In particular, the ISLE-CLWG pursues this goal by designing MILE (Multilingual ISLE Lexical Entry), a general schema for the encoding of multilingual lexical information. This is intended as a meta-entry, acting as...
The Habanera Lexical Knowledge Base Management System
Habanera is a multipurpose multilingual lexical knowledge base developed at CRL to serve as a central repository of multilingual lexical data. The knowledge base contains a set of dictionaries and relations between entries, both within a dictionary (e.g., synonymy) and between entries of different dictionaries (e.g., translation). The format of monolingual lexical entries is left re...
Interchanging Lexical Information for a Multilingual Dictionary
OBJECTIVE: To facilitate the interchange of lexical information for multiple languages in the medical domain, and to pave the way for the emergence of a generally available, truly multilingual electronic dictionary in the medical domain. METHODS: An interchange format has to be neutral relative to the target languages. It has to be consistent with current needs of lexicon authors, present and future...
Publication date: 2002